Email Marketing Analysis

Purpose

The marketing team of an e-commerce site has launched an email campaign. This site has email addresses from all the users who created an account in the past.

They have chosen a random sample of users and emailed them. The email lets the user know about a new feature implemented on the site. From the marketing team's perspective, a success is when the user clicks on the link inside the email. This link takes the user to the company site.

You are in charge of figuring out how the email campaign performed and were asked the following questions:

  1. What percentage of users opened the email and what percentage clicked on the link within the email?

  2. The VP of marketing thinks that it is stupid to send emails to a random subset and in a random way. Based on all the information you have about the emails that were sent, can you build a model to optimize future email campaigns and maximize the probability of users clicking on the link inside the email?

    By how much do you think your model would improve the click-through rate (defined as # of users who click on the link / total users who received the email)? How would you test that?

  3. Did you find any interesting pattern on how the email campaign performed for different segments of users? Explain.

Data Description

email_id : the ID of the email that was sent. It is unique per email.

email_text : there are two versions of the email: one has “long text” (i.e. has 4 paragraphs) and one has “short text” (just 2 paragraphs)

email_version : some emails were “personalized” (i.e. they had the name of the user receiving the email in the incipit, such as “Hi John,”), while some emails were “generic” (the incipit was just “Hi,”).

hour : the user's local time when the email was sent.

weekday : the day when the email was sent.

user_country : the country where the user receiving the email was based. It comes from the user's IP address when she created the account.

user_past_purchases : how many items in the past were bought by the user receiving the email

Setup

Library import

We import all the required Python libraries

Local library import

We import all the required local libraries.

Parameter definition

We set all relevant parameters for our notebook. By convention, parameters are uppercase, while all the other variables follow Python's guidelines.

Data import

We retrieve all the required data for the analysis.

Data processing


Data Preprocessing

Email Click and Open Analysis

In an email marketing campaign, the metrics used to measure performance typically include the average open rate, the average click-through rate (CTR), and the click-to-open rate (CTO). However, these metrics depend heavily on the content of the email and vary by industry. For example, a promotional email will typically have higher open and click rates than a product-introduction email. In our scenario, the email campaign announces a new feature implemented on the site, which intuitively is not very attractive to users. Therefore, we do not expect the CTR to be very high. Let's look at some reference values below:

A high-level overview of overall email marketing statistics for 2020:

  1. Average open rate: 18.0%
  2. Average click-through rate: 2.6%
  3. Average click-to-open rate: 14.1%
  4. Average unsubscribe rate: 0.1%

Sources: https://www.campaignmonitor.com/resources/guides/email-marketing-benchmarks/

Comment: From the chart above, we can see that among 100,000 sent emails, around 10% were opened and 2,119 had the link clicked. The CTR (click-through rate) is therefore about 2%, and the CTO (click-to-open rate) is about 20% (2,119 clicks over roughly 10,000 opens). Compared with the reference values above, the open rate and the CTR are below the 2020 averages, while the CTO is above the average. The high CTO suggests that the content is quite well designed and well personalized for the readers who do open the email.
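
The three ratios can be recomputed directly from the counts. The sketch below is only illustrative: the function name `campaign_rates` is hypothetical, and the 10,000 opens is an assumed round number standing in for "around 10% opened", while the 100,000 sent emails and 2,119 clicks come from the text.

```python
def campaign_rates(sent, opened, clicked):
    """Return open rate, click-through rate, and click-to-open rate as fractions."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return {
        "open_rate": opened / sent,
        "ctr": clicked / sent,
        # Click-to-open is undefined when nobody opened the email.
        "cto": clicked / opened if opened else float("nan"),
    }


# 100,000 sent and 2,119 clicks are taken from the text;
# 10,000 opens is an assumption standing in for "around 10% opened".
rates = campaign_rates(sent=100_000, opened=10_000, clicked=2_119)
print({name: f"{value:.1%}" for name, value in rates.items()})
```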

In the next steps, we will dive deeper into further analyses to better understand the behavior of the email recipients.

Question: On which days of the week do people tend to open the email?

Comment: People tend to open the email on weekdays rather than on the weekend.

Question: On which days of the week do people tend to open the email?

Comment: In general, we can see a high correlation between the open count and the click count.

1. The personalized email has a higher CTO (click-to-open rate) than the generic email  
2. The short email converted more clicks than the long email  
3. Wednesday (no. 1), Tuesday (no. 2), and Monday (no. 3) have more clicks than the other days, although the numbers of opened emails are nearly identical
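
The segment comparisons above boil down to grouping the emails by one attribute and dividing clicks by opens within each group. The following is a minimal sketch using only the standard library; the `opened` and `clicked` flags (0 or 1 per email) and the helper name `click_to_open_by` are assumptions for illustration, and the rows are toy data rather than the campaign data.

```python
from collections import defaultdict


def click_to_open_by(rows, key):
    """Group rows by `key` and return clicks / opens for each group."""
    opens = defaultdict(int)
    clicks = defaultdict(int)
    for row in rows:
        opens[row[key]] += row["opened"]
        clicks[row[key]] += row["clicked"]
    # Groups with zero opens are skipped because the ratio is undefined.
    return {group: clicks[group] / opens[group] for group in opens if opens[group]}


# Toy rows for illustration only, not the campaign data.
sample_rows = [
    {"email_version": "personalized", "opened": 1, "clicked": 1},
    {"email_version": "personalized", "opened": 1, "clicked": 0},
    {"email_version": "generic", "opened": 1, "clicked": 0},
    {"email_version": "generic", "opened": 0, "clicked": 0},
]
cto_by_version = click_to_open_by(sample_rows, "email_version")
print(cto_by_version)
```

The same helper could be called with `"email_text"` or `"weekday"` as the key to compare the other segments.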

Question: During which hours of the day do people tend to click the short personalized email?

Comment: It seems that email recipients tend to open the email when they start work, from 8 to 11 am.

Comment: The larger the number of past purchases, the higher the click ratio. The reason could be that these people have become loyal customers of the website, so they are more likely to follow related updates. Customers who purchase infrequently tend to ignore the email; they are also the group that switches to competitors most easily.

Value and Actions:

From the analysis above, the patterns in customer behavior suggest the email campaign could be optimized in the following ways:

Techniques:

  1. Sending the emails on weekdays, between 6 am and 7 am, before people start working, so that they sit at the top of the inbox, which in turn could increase the open rate.

  2. Working closely with the customer service department to take care of the group of loyal customers (those who made more than 8 to 10 purchases), while increasing the purchase frequency of customers who made fewer transactions. The value of this action is that we could reduce the cost of the email campaign (email service providers may charge based on the number of emails sent) by no longer sending this kind of email, announcing new features on the website, to customers with few purchases, and instead sending them emails with promotions and offers to make them more familiar with the product and service.

Content: The content should be short and well-personalized to the reader with a clear call-to-action. We could achieve those goals by:

  1. Doing more A/B testing with different types of content
  2. Improving the recommendation algorithm to find the best matches for the customer's needs at the right time (what they might need in the next 3 months, 6 months, and so on, to upsell and take care of the customer)
  3. Improving social listening
  4. Reviewing the performance of content writers

Personal thought:

Build Predictive Model

Splitting the data

Model building

Notes: The model we are going to employ is CatBoost, because of its advantages:

  1. Working well with imbalanced datasets and categorical features
  2. High training speed (using GPU)

Reference Link: https://catboost.ai/

Notes: Since the data is imbalanced, we apply more weight to the minority class (in this case, the value "1" of "clicked"). The weight is calculated as below.
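
The notebook's own weight formula is not shown here. One common choice, assumed in this sketch, is the ratio of negative to positive examples, so that both classes contribute roughly equally to the loss. The helper name `minority_class_weight` and the toy labels are hypothetical.

```python
def minority_class_weight(labels):
    """Return n_negative / n_positive, a common weight for the positive class."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples, weight is undefined")
    return negatives / positives


# Toy labels for illustration: 2 clicks out of 100 emails.
toy_labels = [1] * 2 + [0] * 98
weight = minority_class_weight(toy_labels)
print(weight)
```

A value computed this way could then be handed to `CatBoostClassifier` through its `scale_pos_weight` parameter (or as the weight for class 1 in `class_weights`), assuming a binary target.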

Comment: From the outcome of the model, we can see that by sending only the emails classified as "1" (customers predicted to click the link) and no longer sending the emails predicted as "0", we could increase the click rate to 4.6% (recall rate).
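
When reading such a figure, it helps to keep two quantities apart. If only the emails predicted as "1" are sent, the expected click rate among them is the precision for class 1, while recall is the share of all actual clickers that the model captures. The sketch below computes both from scratch; the function name and the toy labels are assumptions for illustration, not the notebook's model output.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for the positive class 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    # Precision: click rate among the emails we would actually send.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of all clickers that the model selects.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```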

Comment: It can be seen that among the predictor variables, user_past_purchases and weekday have the most impact on the click ratio. Therefore, when doing A/B testing, we should focus on those variables first.

What we can do to improve the model and what's next

Next step: Using the insights discovered above, we could strategically target users by segment (users with fewer than 8 past purchases and users with more than 8). On top of that, personalizing the email combined with tailoring the email content could also be considered when doing A/B testing.
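
One way to evaluate such an A/B test is a two-proportion z-test on the click-through rates of a control group (random sending) and a treatment group (model-guided sending). The sketch below uses only the standard library; the function name and the group sizes and click counts are hypothetical, chosen purely to illustrate the calculation.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Return (z, two-sided p-value) for the difference in click rates B - A."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled click rate under the null hypothesis of no difference.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero, test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) vs. model-guided sending (B).
z, p_value = two_proportion_z_test(clicks_a=200, n_a=10_000, clicks_b=260, n_b=10_000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

A small p-value would suggest that the difference in click-through rate between the two groups is unlikely to be due to chance alone, provided users were assigned to the groups at random.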

References

We report here relevant references:

  1. https://github.com/carlssonnp/Optimizing-Email-Marketing
  2. https://www.programmersought.com/article/58566944189/